We consider extensive data on Spanish international trade and population composition and,
through an analysis driven by statistical mechanics and graph theory, we show that the social
network made of native and foreign-born individuals plays a role in the evolution and in the
diversification of trade. Indeed, migrants naturally provide key information on policies and needs
in their native countries, hence allowing firm owners to reduce the transaction costs of exports
and duties. As a consequence, international trading becomes affordable for a larger pool of firms
and thus results in an increased number of transactions, which, in turn, implies a larger
diversification of internationally traded products. These results corroborate the novel scenario
depicted by “Economical Complexity”, where the pattern of production and trade of more developed
countries is highly diversified.

We also address a central question in Economics, concerning the existence of a critical threshold
for migrants (within a given territorial district) above which they effectively contribute to
boosting international trade: in our physically-driven picture, this phenomenon corresponds to the
emergence of a phase transition, and tackling the problem from this perspective yields a novel and
successful quantitative route. Finally, we can infer that the pattern of interaction between native
and foreign-born populations exhibits small-world features such as small diameter, large clustering,
and weak ties working as optimal cut-edges, in complete agreement with findings in “Social Complexity”.

PACS numbers: 89.65.Ef, 89.65.-s, 05.40.-a, 05.70.Fh 


I. INTRODUCTION 

In this work we aim to merge recent findings in Social Complexity with those achieved in
Economical Complexity, in order to deepen our understanding of the socio-economic behaviors
observed in developed societies. In particular, we examine the role of migratory fluxes in the
economic diversification and international trading of the host countries.

The emergence and the fitness of economic diversification are still debated: classical economic
theories prescribe specialization of industrial production for better-performing countries, while
recent studies show that diversification of products plays a key role in modern economies. Quoting
Hidalgo, Klinger, Barabási and Hausmann, inspection of the country databases of exported products
shows that successful countries are extremely diversified, in analogy with biosystems evolving in a
competitive dynamical environment [3]. Oversimplifying, the key idea to explain such a
diversification is that, if the factors (e.g., technology, capital, institutions, skills) necessary
for a country to produce a good are (partially) shared with another good, it is likely that both
goods will be produced [3].

Here, we address a closely related problem: we investigate the diversification of the production
of a country by looking at its exports and by connecting diversification in trades with social
complexity, beyond economical complexity. In particular, we quantitatively show that stocks of
foreign migrants play a crucial role in the establishment of international trades of diversified
products, thus contributing to explaining the genesis of the Hidalgo, Klinger, Barabási and
Hausmann picture. In a nutshell, our results (in agreement with recent literature [43]) suggest
that social interactions between native and foreign-born populations allow crucial knowledge about
policies, needs and duties existing in the foreign countries to be transferred to local firms.
Remarkably, this information, coupled with the capabilities of firm owners, makes it possible to
decrease the overall potential costs of trading, thus allowing a larger number of firms to appear
in the global market, which, in turn, implies broader and more diversified trades.

Thus, our claim is that the interaction network between migrants and natives spreads the social
capital (i.e. the collective resources of the community, including information, expertise and
skills) and this enhances the extensive margin of trades, which, in turn, acts as a boost in the
diversification of the exported products.

In order to prove these statements, we introduce a statistical-mechanics scaffold (where data can
be rationally framed) and, step by step, we check for the empirical confirmation of our assumptions
and our theoretical results, by analyzing the test case of Spain. In fact, this country has
experienced a (well-documented) influx of migrants since 1998, with a very rapid increase during
the period 2000-2008, and this constitutes an ideal context to investigate the role of immigrants
in creating new trade relationships.

More precisely, our work is structured as follows. 

In the first part, devoted to the statistical-mechanical analysis, we introduce the simplest
possible model (i.e., a minimal Hamiltonian) that relates two parties: foreign-born and native
people living in a given district of the country. As a result of the interaction between the two
parties, natives will stochastically decide whether to trade with the country of origin of the
immigrants. Remarkably, we prove that this model belongs to the class of copying models, or
single-party ferromagnets in the jargon of statistical physics, where native decision makers alone
come into play and spontaneously behave in an imitative way. Through this approach we are able to
quantify the role of immigration in the volume of trades and to include this phenomenon in the
framework of phase transitions. Within this setting, we can also test empirically whether a
critical mass of migrants is needed in order to ensure that a positive pro-trade effect of
migration exists, in agreement with the pioneering suggestions by Gould [21] and, more recently,
with the non-linear theories driven by Chaney’s distorted gravity scheme [22]. Our theoretical
findings predict a non-linear dependence, encoded by a hyperbolic tangent, of exports to a given
foreign country on the percentage of immigrants hailing from that country, and are successfully
checked by comparison with the Spanish dataset. We conclude the first part of the paper by showing
the existence of a clear and robust correlation between the degree of product-destination
diversification of exports (measured in terms of the Herfindahl index) and the number of migrants
as a fraction of the total population.

Finally, our theory also allows us to infer the topological structure of the host society, and
this is addressed in the second part of the paper. Interestingly, we find that the society displays
small-world features and recovers the Granovetter theory of weak ties [23-25]. Incidentally, we
notice that this is also compatible with recent research investigating the role of immigrant
integration in labor markets [26].

II. RESULTS 

Before introducing our model, a few points must be 
clarified (and empirically proven to hold): 

• Our theory, developed within a classical statistical-mechanical perspective, is set at a
microscopic level and accounts for an ensemble of native “decision makers”, whose behavior (i.e.,
the propensity to undertake an international trade) can be affected by the interaction with
migrants. However, the theoretical outcomes of such a model are compared with available data on
international trades performed by firms: in principle, it is not obvious that we can switch from
the microscopic level (i.e. decision makers), where the whole theory lies, to the mesoscopic level
(i.e. firms), where the data analysis is performed. This is allowed if and only if there exists a
linear proportionality between the total population and the total number of firms. Luckily, this
is the case in Spain for the considered time window (1998-2012), as corroborated by the empirical
findings shown in Fig. 1. Thus, as far as scalings are concerned, we can exploit the theoretical
predictions for the average behavior of decision makers (stemming from the statistical-mechanics
model) to describe the expected attitude of firms (that we infer from empirical data).



FIG. 1: Each data point (blue bullet) represents the number of firms versus the population of a
given Spanish province (out of 50) for a given year (in the interval 1998-2012). The linear
proportionality of these quantities is highlighted by binned data (green squares), whose best fit
is given by a linear law (red solid line) with slope ≈ 1.02 ± 0.03 and goodness ≈ 0.99.


• The total amount of trades $Y$ is usually defined in terms of two contributions: the number of
firms that perform international trading (i.e. the extensive margin $Y_{ext}$) and the amount of
money each firm moves in any transaction (i.e. the intensive margin $Y_{int}$), namely
$Y = Y_{ext} \cdot Y_{int}$ or, on a logarithmic scale, $\log Y = \log Y_{ext} + \log Y_{int}$.
Chaney has shown that a reduction in fixed trade costs has a positive impact on $Y_{ext}$ [22];
Peri and Requena have shown that migrants have a positive effect on the extensive margin of trade
in Spain, hence deriving that migrants facilitate trade mainly by reducing the fixed costs of
exporting. On the other hand, the intensive margin of trades seems to be poorly affected by
migration stocks. Thus, our theory is actually devoted to capturing the evolution of $Y_{ext}$.

• The available database reports the total volume of transactions, that is $\log Y$. As a
consequence, we first need to prove that the expected linear proportionality between $\log Y$ and
$\log Y_{ext}$ is fulfilled by our data, so that, later, we are entitled to analyze the evolution
of $\log Y$ as a function of migrant density inside the host country in order to extrapolate an
analogous scaling for $\log Y_{ext}$. This proportionality is robustly checked, as shown in Fig. 2.

FIG. 2: Each data point (blue bullet) represents the number of exporting firms $Y_{ext}$ versus the
overall extent of trades $Y$ for a given Spanish province (out of 50) for a given year (in the
interval 1998-2012). Binned data (green squares) are best-fitted by a straight line (red solid
line) $y = ax$, with $a \approx 0.006$ and $R^2 \approx 0.89$.
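
The binned linear fits reported in Figs. 1 and 2 can be reproduced in outline with a few lines of code. The following is a minimal sketch, not the authors' procedure: the equal-width binning, the helper names `bin_means` and `linear_fit`, and the synthetic data (including the 0.07 firms-per-inhabitant ratio) are all assumptions introduced for illustration.

```python
import random


def bin_means(xs, ys, n_bins):
    """Average (x, y) pairs inside equal-width bins of x; empty bins are skipped."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all x coincide
    acc = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        k = min(int((x - lo) / width), n_bins - 1)  # the maximum goes in the last bin
        acc[k][0] += x
        acc[k][1] += y
        acc[k][2] += 1
    return [(sx / n, sy / n) for sx, sy, n in acc if n > 0]


def linear_fit(points):
    """Ordinary least squares y = slope * x + intercept; also returns R^2."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - my) ** 2 for _, y in points)
    return slope, intercept, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic province-year data (hypothetical numbers, not the Spanish dataset).
    rng = random.Random(0)
    population = [rng.uniform(1e5, 5e6) for _ in range(750)]
    firms = [0.07 * n * rng.uniform(0.9, 1.1) for n in population]
    slope, intercept, r2 = linear_fit(bin_means(population, firms, 20))
    print(f"slope={slope:.4f}, intercept={intercept:.1f}, R^2={r2:.3f}")
```

Binning before fitting, as done in the figures, reduces the weight of the scatter among individual province-year points, so that the fit probes the average trend only.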


FIG. 3: Sketch of the bipartite network modeling mutual interactions between natives (left
community) and immigrants (right community). The coupling between the native labeled as $i$ and
the immigrant labeled as $\mu$ is denoted as $\xi_i^{\mu}$, while the coupling between two natives
labeled as $i$ and $j$, respectively, is denoted as $J_{ij}$.


A. PART ONE: Insights from Statistical Mechanics 

First, we need to set a proper length-scale: as the migration-trade relation is known to be an
in-province phenomenon [44], we fix the degree of resolution at the provincial level. Then, for
any arbitrary province, we denote with $N$ its population and notice that the $N$ individuals can
be divided into two groups: $N_1$ natives and $N_2$ foreign-born, with $N_1 + N_2 = N$. We also
define

$$\gamma = \frac{N_2}{N}, \qquad (1)$$

measuring the relative size of the two groups, and we introduce $\Gamma = \gamma(1-\gamma)$ too,
the latter representing the normalized number of cross links between the two communities: note
that for small $\gamma$ (and this is the case for Spain), $\Gamma \approx \gamma$.

Moreover, we introduce two sets of variables (i.e. spins), referred to as $\sigma_i$
($i = 1, \dots, N_1$) and $z_{\mu}$ ($\mu = 1, \dots, N_2$), respectively, such that
$\sigma_i \in \{-1, +1\}$ represents the propensity of the native agent $i$ to establish
($\sigma_i = +1$) or not establish ($\sigma_i = -1$) a trade, while the variables $z_{\mu}$
represent the quantity of information, either positive ($z_{\mu} > 0$) or negative ($z_{\mu} < 0$),
that the $\mu$-th immigrant can provide (regarding trading toward his/her country of origin).
Otherwise stated, the ensemble $\{z_{\mu}\}$ represents the social capital of the immigrant
community and, in the absence of any additional information, in a mean-field approach, it can be
thought of as a collection of independent and identically distributed Gaussian variables.

The diffusion of the social capital and the decisional mechanism can now be described by a
Hamiltonian (i.e. a cost function in economical vocabulary) $H(\sigma, z; \mathbf{J}, \xi)$,
dependent on the couplings $\mathbf{J}$ and $\xi$, encoding native-native interactions and
native-migrant interactions, respectively (see Fig. 3, left panel).

Now, let us inspect in more detail the interaction patterns and the resulting Hamiltonian.

The interaction between a native, say $i$, and a foreign-born individual, say $\mu$, is encoded by
the variable $\xi_i^{\mu} \in \{0,1\}$ describing the presence ($\xi_i^{\mu} = 1$) or the absence
($\xi_i^{\mu} = 0$) of a connection (e.g. friend, work-mate, acquaintance, relative) between $i$
and $\mu$. The set of variables $\xi$ generates the topology of the social network between
immigrants and natives. Since there exists neither detailed information about individual
connections nor a broadly accepted protocol for measuring them, and having checked that migratory
fluxes are uncorrelated (i.e. the time-scales considered are long enough and migrants come from a
wide range of countries), the most basic assumption one can make is simply to consider the
completely general set of $\xi_i^{\mu}$ as i.i.d. random variables, extracted with probability

$$P(\xi_i^{\mu} = 1) = 1 - P(\xi_i^{\mu} = 0) = \frac{\alpha}{N^{\theta}}, \qquad (2)$$

where $\theta \in (0,1)$ and $\alpha$ are province-dependent parameters: in this way, by properly
tuning $\theta$ and $\alpha$, the network recovers all the standard regimes (e.g., extreme
dilution, finite connectivity, etc.) and, by fitting these parameters over the available data, we
can infer the topological features of the actual Spanish network for the analyzed years [29].
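
To make the random-link assumption concrete, the sketch below draws one realization of the native-migrant links as independent Bernoulli variables. It is only an illustration: the functional form $\alpha/N^{\theta}$ for the link probability is an assumption here (with $N$ the total provincial population), and the function name `sample_links` and all numerical values are hypothetical.

```python
import random


def sample_links(n_natives, n_migrants, alpha, theta, seed=0):
    """Draw the 0/1 matrix xi[i][mu] of native-migrant links.

    Each entry is 1 with probability alpha / N**theta (assumed form),
    where N = n_natives + n_migrants, independently of all the others.
    """
    rng = random.Random(seed)
    n_total = n_natives + n_migrants
    p_link = min(1.0, alpha / n_total ** theta)
    xi = [[1 if rng.random() < p_link else 0 for _ in range(n_migrants)]
          for _ in range(n_natives)]
    return xi, p_link


if __name__ == "__main__":
    xi, p_link = sample_links(n_natives=950, n_migrants=50, alpha=2.0, theta=0.5)
    links_per_native = sum(map(sum, xi)) / len(xi)
    print(f"link probability: {p_link:.4f}")
    print(f"average number of migrant contacts per native: {links_per_native:.2f}")
```

Increasing `theta` at fixed `alpha` dilutes the network, which is the mechanism through which the different topological regimes are spanned.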

Analogously, $\mathbf{J}$ describes the connections among natives and, at this stage, it can be
assumed to be arbitrary but endowed with a well-defined average value $J$ (see also Part Two for
more details), which, in principle, depends on the province $p$.

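Therefore, at the provincial level of resolution, the system can be described by the Hamiltonian

$$H(\sigma, z; \mathbf{J}, \xi) = -\frac{1}{N_1}\sum_{(i,j)} \mathbf{J}_{ij}\,\sigma_i\sigma_j - \frac{1}{N^{1-\theta}}\sum_{i=1}^{N_1}\sum_{\mu=1}^{N_2}\xi_i^{\mu}\sigma_i z_{\mu}. \qquad (3)$$

Note that in the second term in the r.h.s. of the above equation, the normalization factor
$1/N^{1-\theta}$ ensures the linear extensivity of the Hamiltonian or, analogously, that the field
$h_i$ acting on any spin $\sigma_i$ is $O(1)$. In fact,
$h_i = \frac{1}{N^{1-\theta}}\sum_{\mu=1}^{N_2}\xi_i^{\mu} z_{\mu}$ and the expected number of
non-null entries in the vector $\xi_i$, namely the expected number of non-null terms in the sum,
is just $\alpha N_2/N^{\theta} = \alpha\gamma N^{1-\theta}$.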

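Before proceeding, we need to introduce a parameter $\beta$ to tune the degree of stochasticity in
the system, in such a way that for $\beta \to 0$ the system behaves completely randomly, while as
$\beta \to \infty$ the system deterministically relaxes to the configuration corresponding to the
minimum of the cost function. Thus, the partition function $Z$ of the model defined by the
Hamiltonian (3) reads as

$$Z = \sum_{\sigma}\int d\mu(z)\,\exp\Big(\frac{\beta}{N_1}\sum_{(i,j)}\mathbf{J}_{ij}\,\sigma_i\sigma_j + \frac{\beta}{N^{1-\theta}}\sum_{i=1}^{N_1}\sum_{\mu=1}^{N_2}\xi_i^{\mu}\sigma_i z_{\mu}\Big) \qquad (4)$$

$$\phantom{Z} = \sum_{\sigma}\exp\Big(\frac{\beta}{N_1}\sum_{(i,j)}\mathbf{J}_{ij}\,\sigma_i\sigma_j + \frac{\beta^2}{2N^{2(1-\theta)}}\sum_{i,j=1}^{N_1}\sum_{\mu=1}^{N_2}\xi_i^{\mu}\xi_j^{\mu}\sigma_i\sigma_j\Big), \qquad (5)$$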


where we called $d\mu(z)$ the standard Gaussian measure. Crucially, by a direct comparison of the
arguments in the exponents of Eqs. (4) and (5), respectively, we see that the bipartite
interactions between natives and immigrants (i.e. those $\propto \xi_i^{\mu}\sigma_i z_{\mu}$ in
the first line) are stored in an effective coupling $J_{ij}$ between couples of local decision
makers alone (i.e. those $\propto \xi_i^{\mu}\xi_j^{\mu}\sigma_i\sigma_j$ in the second line). Such
a coupling is Hebbian-like [29], as

$$J_{ij} = \frac{\sum_{\mu=1}^{N_2}\xi_i^{\mu}\xi_j^{\mu}}{N^{2(1-\theta)}}. \qquad (6)$$
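
The Hebbian-like structure of the effective coupling can be illustrated with a small example: two natives are coupled whenever they share at least one migrant acquaintance, and the coupling grows with the number of shared acquaintances. This is a sketch under stated assumptions: the normalization $N^{2(1-\theta)}$ with $N = N_1 + N_2$ is assumed, and the function name `effective_couplings` and the toy link matrix are hypothetical.

```python
def effective_couplings(xi, theta):
    """Return the native-native matrix J[i][j] = sum_mu xi[i][mu] * xi[j][mu] / N**(2 * (1 - theta)).

    xi is the N1 x N2 matrix of 0/1 native-migrant links; N = N1 + N2 (assumed normalization).
    Diagonal entries are left at zero because self-couplings play no role.
    """
    n1, n2 = len(xi), len(xi[0])
    norm = (n1 + n2) ** (2.0 * (1.0 - theta))
    couplings = [[0.0] * n1 for _ in range(n1)]
    for i in range(n1):
        for j in range(i + 1, n1):
            shared = sum(a * b for a, b in zip(xi[i], xi[j]))  # common migrant contacts
            couplings[i][j] = couplings[j][i] = shared / norm
    return couplings


if __name__ == "__main__":
    # Three natives, two migrants: natives 0 and 1 both know migrant 0.
    xi = [[1, 0],
          [1, 1],
          [0, 0]]
    for row in effective_couplings(xi, theta=1.0):
        print(row)
```

Natives with no common migrant contact remain uncoupled, so the dilution of the bipartite links is inherited by the effective network among natives.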


Therefore, the bipartite model described in Eq. (3) is thermodynamically equivalent to a
monopartite ferromagnetic (i.e. with imitation among natives) model embedded in a random, diluted
structure [29] (see Fig. 3, right panel). Although the underlying graph is not fully connected
(and we will show later that, at least for the Spanish case, it is a small-world network), it is
not under-percolated, hence the model still exhibits a phase transition qualitatively analogous
to the one pertaining to the Curie-Weiss scenario [30] [31].

The “order parameter” for this model is given by
$M(\sigma) = \frac{1}{N_1}\sum_{i=1}^{N_1}\frac{1+\sigma_i}{2}$, namely the fraction of individuals
inclined to an international trade (i.e., the fraction of spins positively aligned). This order
parameter is equivalent (upon a shift and a rescaling) to
$m(\sigma) = \frac{1}{N_1}\sum_{i=1}^{N_1}\sigma_i$, namely the standard magnetization of the
system (in its ferromagnetic interpretation [28]). Now, it is worth recalling that the linear
proportionality between decision makers and firms in the Spanish provinces (see Fig. 1) allows
inferring only scalings and proportionality relations (but not exact values) for the number of
trading firms. Therefore, there is no loss of information in using the (mathematically more
convenient) $m$ instead of $M$, and hereafter we will retain the former observable to quantify the
extensive margin of trades $Y_{ext}$. Moreover, as explained in the previous section, the
evolution of $Y_{ext}$ can be related to the evolution of trades $Y$ as a whole.

By applying the standard statistical-mechanical machinery (see Appendix A for a detailed
derivation), we attain the following self-consistent equation for $m$:

$$m = \tanh\left(\beta J m + \beta^2\alpha^2\Gamma m\right). \qquad (7)$$

This is the main formula of this first part as, following the scaling $m \propto Y$ argued above,
it relates the growth of trades with the percentage of migrants (we recall
$\Gamma = \gamma(1-\gamma) \approx \gamma$). The agreement between Eq. (7) and the Spanish test
case is reported in Fig. 4 and deepened in the Data Analysis Section.
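
A self-consistent equation of this kind is usually solved numerically by fixed-point iteration. The sketch below assumes the form $m = \tanh[(\beta J + \beta^2\alpha^2\Gamma)\,m]$; the function name `magnetization` and the parameter values are hypothetical and chosen only to show the two regimes.

```python
import math


def magnetization(beta, J, alpha, gamma_cross, m0=0.5, tol=1e-12, max_iter=100000):
    """Iterate m -> tanh(coeff * m) with coeff = beta*J + (beta*alpha)**2 * gamma_cross.

    Starting from a positive m0, the iteration converges to m = 0 when coeff <= 1
    and to the positive non-trivial solution when coeff > 1.
    """
    coeff = beta * J + (beta * alpha) ** 2 * gamma_cross
    m = m0
    for _ in range(max_iter):
        m_new = math.tanh(coeff * m)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m


if __name__ == "__main__":
    # Hypothetical parameters: the threshold sits at gamma_cross = 0.5 here.
    for gamma_cross in (0.2, 0.6, 1.0, 2.0):
        m = magnetization(beta=1.0, J=0.5, alpha=1.0, gamma_cross=gamma_cross)
        print(f"Gamma = {gamma_cross:.1f} -> m = {m:.4f}")
```

Close to the threshold the iteration slows down considerably (critical slowing down), which is why a generous `max_iter` is used.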

Remarkably, Eq. (7) also contains information regarding the critical percentage of migrants that
must be reached before they start to influence new trade relationships. To extract such
information, we exploit the statistical-physics know-how on phase transitions: when the
coefficient of $m$ in the argument of the hyperbolic tangent is smaller than one, the only
solution of Eq. (7) is $m = 0$. However, as this coefficient gets larger than one, non-zero
solutions appear and we can expand the hyperbolic tangent as

$$m \approx \beta(J + \beta\alpha^2\Gamma)\,m - \frac{1}{3}\left[\beta(J + \beta\alpha^2\Gamma)\right]^3 m^3 + O(m^5), \qquad (8)$$

and, excluding the paramagnetic solution ($m = 0$), we get

$$m \approx a\sqrt{\Gamma - \Gamma_c}, \qquad (9)$$

where $a = \sqrt{3}\,\beta\alpha/[\beta(J + \beta\alpha^2\Gamma)]^{3/2}$ and
$\Gamma_c = (1-\beta J)/(\beta\alpha)^2$.

From the previous equation we see that as long as $\Gamma < \Gamma_c$ no real non-zero solution
exists. Thus, when the percentage of migrants within a given province is smaller than $\Gamma_c$,
trades can of course take place, but the related international market is not influenced by the
presence of migrants within the province itself.
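
The quality of the square-root approximation just above the threshold can be checked numerically. The sketch below is illustrative only: it assumes the self-consistency $m = \tanh(A m)$ with $A = \beta J + (\beta\alpha)^2\Gamma$ and the threshold $\Gamma_c = (1-\beta J)/(\beta\alpha)^2$; the function names and the parameter values are hypothetical.

```python
import math


def critical_fraction(beta, J, alpha):
    """Threshold at which the coefficient A = beta*J + (beta*alpha)**2 * Gamma equals one."""
    return (1.0 - beta * J) / (beta * alpha) ** 2


def sqrt_approximation(beta, J, alpha, gamma_cross):
    """Non-zero root of the cubic truncation m = A*m - (A*m)**3 / 3 (zero below threshold)."""
    A = beta * J + (beta * alpha) ** 2 * gamma_cross
    if A <= 1.0:
        return 0.0
    return math.sqrt(3.0 * (A - 1.0) / A ** 3)


def exact_solution(beta, J, alpha, gamma_cross):
    """Positive root of tanh(A*m) - m found by bisection (zero below threshold)."""
    A = beta * J + (beta * alpha) ** 2 * gamma_cross
    if A <= 1.0:
        return 0.0
    lo, hi = 1e-6, 1.0  # tanh(A*m) - m is positive at lo and negative at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.tanh(A * mid) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    beta, J, alpha = 1.0, 0.5, 1.0  # hypothetical values
    print("Gamma_c =", critical_fraction(beta, J, alpha))
    for gamma_cross in (0.51, 0.55, 0.7):
        approx = sqrt_approximation(beta, J, alpha, gamma_cross)
        exact = exact_solution(beta, J, alpha, gamma_cross)
        print(f"Gamma = {gamma_cross:.2f}: sqrt law {approx:.4f}, exact {exact:.4f}")
```

As expected for a truncated expansion, the two values agree closely just above the threshold and drift apart as $\Gamma$ grows and the hyperbolic tangent approaches saturation.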

Three important aspects of the relation between migration and trading are thus coded in Eq. (7):

• The relation between migrant density and growth of trades is non-linear, as these observables
are related via a hyperbolic tangent.

• There exists a critical value for the fraction of migrants, that reads as

$$\Gamma_c = \frac{1-\beta J}{(\beta\alpha)^2}, \qquad (10)$$

beyond which they start to have a net effect on international trading for the host province.
Notice that $\Gamma_c$ depends on the level of stochasticity (via $\beta$) and, in principle, is
province dependent through $J$ and $\alpha$. In fact, we stress that the previous derivation holds
for any arbitrary province and, in general, the parameter set $(\beta, J, \alpha)$ is province
dependent, in such a way that the outline of $m$ versus $\Gamma$ as well as the critical value
$\Gamma_c$ vary with the province. However, note that, in principle, $\Gamma_c$ can be vanishing.

• There is a saturation effect for large enough $\Gamma$, as the hyperbolic tangent is a bounded
function that eventually reaches a plateau. Exhaustion levels in bilateral exports have already
been linked with migrant saturation effects as, for instance, in the experimental works discussed
in [9].

1. Data Analysis


We check our findings versus empirical data for the test case of Spain. The overall dataset is
obtained by merging two sources: trade data come from the ADUANAS-AEAT dataset provided by the
Ministerio de Economia y Hacienda, and demographic data come from the Spanish Statistical Office
(INE).

We consider the time series for exports $\{Y_{y,p}\}$ and for the fraction of immigrants
$\{\gamma_{y,p}\}$, along the range of years $y = 1998, \dots, 2012$ and for the 50 provinces
$p = 1, \dots, 50$ making up the country (EUROSTAT NUTS III definition). Thus, our time range is
made of $N_y = 15$ years and our geographic set is made of $N_p = 50$ provinces. Preliminarily, as
we start from historical series, we check that at least one of the observables $Y$ and $\gamma$
is monotonically increasing with respect to the years, and $\gamma(y)$ satisfies this request.
Thus, we are allowed to invert $\gamma(y) \to y(\gamma)$ and look at the evolution of $Y$ as a
function of $\gamma$, so as to obtain $Y(\gamma)$, which must then be suitably binned and averaged
(see [1] for details on this procedure).
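
The inversion step relies only on the monotonicity of the immigrant fraction in time: if $\gamma(y)$ is strictly increasing, each year corresponds to a unique value of $\gamma$ and exports can be re-indexed by $\gamma$ instead of by year. A minimal sketch, with hypothetical function names and made-up numbers (not the INE or ADUANAS-AEAT data):

```python
def is_strictly_increasing(series):
    """True if every element is larger than the previous one."""
    return all(later > earlier for earlier, later in zip(series, series[1:]))


def exports_versus_gamma(gamma_by_year, exports_by_year):
    """Re-index the yearly exports of one province by the immigrant fraction.

    Both inputs are lists ordered by year. The inversion year -> gamma is only
    well defined when gamma grows strictly with time, so this is checked first.
    """
    if len(gamma_by_year) != len(exports_by_year):
        raise ValueError("the two series must cover the same years")
    if not is_strictly_increasing(gamma_by_year):
        raise ValueError("gamma(y) is not monotonic: it cannot be inverted")
    return list(zip(gamma_by_year, exports_by_year))


if __name__ == "__main__":
    # Made-up series for a single province over five consecutive years.
    gamma = [0.010, 0.015, 0.022, 0.031, 0.040]
    exports = [1.0e9, 1.3e9, 1.9e9, 2.4e9, 2.6e9]
    for g, y in exports_versus_gamma(gamma, exports):
        print(f"gamma = {g:.3f} -> Y = {y:.2e}")
```

The resulting pairs from all provinces can then be pooled, binned in $\gamma$ and averaged, as described above.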

The whole set of provinces constitutes our pool, namely we consider different provinces as
independent realizations (or, otherwise stated, extractions) of the same system. This means that
the trades of a given province are taken to depend only on the fraction of immigrants within the
province itself. While there is general consensus on this, the consistency of such a hypothesis
is shown in [32], where the authors prove that proximity (meant as geographical closeness) is
fundamental for the diffusion of the social capital and therefore for the growth of trades.

For each province $p$, we can measure the percentage of immigrants $\gamma_p$ and plot $Y_p$
versus $\Gamma_p \approx \gamma_p$, as shown in Fig. 4 for some exemplary cases. Note that the
theoretical predictions (see Eq. (7)) are in remarkable agreement with the empirical behaviour. We
performed extensive fits over all the provinces available according to Eq. (7), which we report
hereafter as

$$m = \tanh\left\{\left[b + (1-b)\frac{\Gamma}{\Gamma_c}\right] m\right\}, \qquad (11)$$

where we highlighted the critical density $\Gamma_c = (1-b)/(\beta\alpha)^2$ and we posed
$b = \beta J$. While fitting, an extra, province-dependent parameter, referred to as $a$, has to
be introduced in order to account for the fact that, due to the scaling between $Y$ and $m$, the
former is in principle not bounded. The best-fit coefficients are collected in Fig. 5. Notably, we
checked that these results are fully consistent with the analogous parameters that one would
obtain when fitting with the more explicit square-root function (9), at least as far as small
values of $\Gamma$ are considered. In particular, we notice that $\log(a)$ is roughly uniformly
distributed along the range $(12, 19)$, suggesting that the extent of exports varies over several
orders of magnitude, according to the province considered. On the other hand, $\Gamma_c$ looks
Poissonian-like distributed and is peaked around 0.003, suggesting that when immigrants are less
than 0.3% of the whole population inside the province, their presence is ineffective as a
facilitator of trade with their country of origin.

FIG. 4: Exports $Y$ versus $\Gamma$ for three different provinces, as explained by the legend; we
choose the three largest provinces for the sake of readability and for consistency with the
analysis of the following sections; however, we checked that analogous plots hold also for the
other provinces. In this plot each data point corresponds to a different year. The solid lines
represent the best fit according to Eq. (11) and the goodness of the fit is $R^2 = 0.94$ (Madrid),
$R^2 = 0.97$ (Barcelona), and $R^2 = 0.95$ (Valencia).
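
The fitting step can be sketched as follows. This is not the authors' fitting code: it assumes that exports are modeled as $Y = a\,m(\Gamma)$, with $m$ the positive solution of $m = \tanh\{[b + (1-b)\Gamma/\Gamma_c]\,m\}$, and it replaces a proper non-linear optimizer with a plain grid search over $(\Gamma_c, b)$, solving for the scale $a$ in closed form. Function names, grids and data are hypothetical.

```python
import math


def m_of_gamma(gamma_cross, gamma_c, b, iters=500):
    """Positive fixed point of m = tanh(coeff * m), coeff = b + (1 - b) * Gamma / Gamma_c."""
    coeff = b + (1.0 - b) * gamma_cross / gamma_c
    m = 1.0
    for _ in range(iters):
        m = math.tanh(coeff * m)
    return m


def fit_tanh_law(gammas, exports, gamma_c_grid, b_grid):
    """Grid search over (Gamma_c, b); for each pair the scale a is the least-squares optimum."""
    best = None
    for gamma_c in gamma_c_grid:
        for b in b_grid:
            ms = [m_of_gamma(g, gamma_c, b) for g in gammas]
            denom = sum(m * m for m in ms)
            if denom == 0.0:
                continue
            a = sum(y * m for y, m in zip(exports, ms)) / denom
            err = sum((y - a * m) ** 2 for y, m in zip(exports, ms))
            if best is None or err < best[0]:
                best = (err, gamma_c, b, a)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Synthetic "exports" generated from known parameters, then recovered by the fit.
    gammas = [0.004, 0.006, 0.009, 0.013, 0.020]
    exports = [100.0 * m_of_gamma(g, 0.003, 0.5) for g in gammas]
    gamma_c, b, a = fit_tanh_law(gammas, exports,
                                 gamma_c_grid=[0.001, 0.002, 0.003, 0.004],
                                 b_grid=[0.25, 0.5, 0.75])
    print(f"Gamma_c = {gamma_c}, b = {b}, a = {a:.2f}")
```

Treating $a$ analytically reduces the search to two parameters; with real data a finer grid or a gradient-based optimizer would be needed, together with an estimate of $R^2$.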

Note that, through the statistical-mechanics route of phase transitions, finding the critical
mass is quite simple, while via standard approaches accessing this quantity would be much more
complex, as $\Gamma_c$ is a function of several local variables, as coded in Eq. (10).



FIG. 5: Histograms for the best-fit coefficients $\Gamma_c$ (upper panel), $a$ (middle panel), and
$b$ (lower panel) obtained by fitting $Y_p$ versus $\Gamma_p$ according to Eq. (11) for each
province $p$. Notice that, due to the broad range spanned by $Y_p$ (and, accordingly, by $a$), we
represent the histogram of $\log(a)$.




























2. Bilateral trades 


In order to get a finer picture, and to investigate the possible existence of a country-dependent
critical threshold $\Gamma_c$, we split the migrant party into several subsets, each corresponding
to a different country of origin, and then we analyze the trades $Y_{p,f}$ performed between any
province $p$ and any foreign country $f$ as a function of the related fraction of immigrants
$\Gamma_{p,f}$. Of course, results are expected to be much noisier, as we are dealing with
considerably smaller datasets and the intrinsic fluctuations are only partially smoothed by the
central limit theorem. Nonetheless, it is worth checking whether the previous results are still
valid at this less coarse-grained level, and inferring the country-dependent critical masses. We
focused on the three major Spanish cities, namely Madrid (Fig. 6), Barcelona (Fig. 7) and Valencia
(Fig. 8), and on the foreign countries for which the size of the immigrant community is larger and
spans a wide interval in the time window considered, in order to get more accurate and reliable
fits.


By fitting data according to Eq. (11) we derive estimates for $\Gamma_c$ which, in general, depend
on both $p$ and $f$, as shown in Fig. 9. In particular, $\Gamma_c$ follows a distribution peaked
around $\Gamma_c \sim 10^{-5}$, which is consistent with the previous value
$\approx 3 \cdot 10^{-3}$, as migrants come from $O(10^2)$ different countries.

We finally notice that $\Gamma_c$ seems to vary slightly with the size of the hosting population,
consistently with expected finite-size effects.

Lastly, we checked that there is a clear correlation between the critical value $\Gamma_c$,
obtained for trades between $p$ and $f$, and the size $N_2$ of the community of migrants hailing
from $f$ and resident in $p$. This linear correlation is confirmed for the four largest cities we
analyzed in detail (i.e. Madrid, Barcelona, Valencia, Sevilla), as shown in Fig. 10.


FIG. 6: Trades performed by the province of Madrid with different foreign countries as a function
of the related immigrant density: different countries are depicted in different colours, as
specified by the legend. Data (bullets) are fitted via Eq. (11) (solid line). The foreign
countries considered are those where $\Gamma$ spans the largest interval, so that fits can be more
accurate.

FIG. 7: Trades performed by the province of Barcelona with different foreign countries as a
function of the related immigrant density: different countries are depicted in different colours,
as specified by the legend. Data (bullets) are fitted via Eq. (11) (solid line). The foreign
countries considered are those where $\Gamma$ spans the largest interval, so that fits can be more
accurate.

FIG. 8: Trades performed by the province of Valencia with different foreign countries as a
function of the related immigrant density: different countries are depicted in different colours,
as specified by the legend. Data (bullets) are fitted via Eq. (11) (solid line). The foreign
countries considered are those where $\Gamma$ spans the largest interval, so that fits can be more
accurate.

FIG. 9: Critical density $\Gamma_c$ estimated by fitting data for trading versus $\Gamma$
according to the theoretical law Eq. (11). Analyses are performed for the four largest provinces:
Madrid, Barcelona, Valencia and Sevilla, depicted in different symbols and colors as explained by
the legend. We added Sevilla to confirm with a fourth point the trend depicted by the first three
cities. For each available foreign country $f$ we fit the data for trades versus $\Gamma$
according to the tanh law of Eq. (11) and we derive an estimate for $\Gamma_c$ with the related
$R^2$, which are plotted in the topmost panel. The most reliable fits (i.e. $R^2$ close to 1)
suggest that the $\Gamma_c$ are scattered around $10^{-5}$, similarly for the four provinces
analyzed. Focusing on estimates corresponding to $R^2 > 0.85$, we build the histogram of
$\Gamma_c$ (shown in the middle panel) and calculate the arithmetic average of $\Gamma_c$, which
is plotted in the bottom panel as a function of the population of the related province.

FIG. 10: The values of $\Gamma_c$, obtained by fitting data for trades between $p$ and $f$, are
related to the size $N_2$ of the community of migrants hailing from $f$ and resident in $p$. Here
we focused on the four largest provinces and for each we show data stemming from a binning
procedure, in such a way that the error bars represent the standard deviation for the data points
pertaining to the same bin.


3. The relation between migrants and product diversification


Having proved that the amount of trades is positively influenced by migration, we still have to
check that the diversification of exports is enhanced as well, namely, that migration plays a
significant role in the modern theory of Economical Complexity.

In order to keep this analysis as simple as possible, we do not deal with recent complexity
measures [7, 8, 33], but we follow the simplest possible route (leaving possible improvements for
future works).

The export portfolio of a province is composed of products and destinations. That is, a province
can export several products to a single destination or export the same product to several
destinations. Thus, the basic unit in the export portfolio is a product-destination pair. We
define $K$ as the total number of product-destination pairs in the export portfolio of a province.
Products are defined using the HS1996 product classification. Destinations are defined as
countries with a population of more than 1 million in 2010. There are 4507 products and 154
countries, so the total number $K$ of product-destination pairs is 694078.

To account for the distribution of export sales across product-destination pairs, we use the
export share of each product-destination pair in total export value, so as to capture the relative
importance of each pair for exports. The Herfindahl index $N_H$ is a simple calculation of
concentration of exports that uses such export shares: the larger the number $N_H$, the more
concentrated (less diversified) the export portfolio of the province is. Therefore, if migrants do
really contribute to the diversification of exports, we should expect a negative correlation
between $N_H$ and $\Gamma$. More precisely, the $N_H$ index is calculated as

$$N_H = \sum_{i=1}^{K}\left(\frac{x_i}{X}\right)^2, \qquad (12)$$

where $x_i$ is the value of exports in the product-destination pair $i$ and $X$ is the total value
of exports. One can further normalise $N_H$ to get an index $n_H$ whose values lie between 0 and
1. Results are shown in Fig. 11, where the negative correlation between $n_H$ and the percentage
of migrants within the province is manifest.

FIG. 11: In the main plot, bullets represent the value of the normalized diversification index
$n_H$ for different provinces and different years as a function of $\Gamma$. Green squares
represent binned data and the solid red line is the related best fit. This is a linear curve (in
log-log scale) $y = p_1 x + p_2$, with $p_1 = -0.21 \pm 0.01$ and $p_2 = -5.47 \pm 0.01$. The
fitting has also been performed for data of $n_H$ pertaining to any single province and any single
year, hence obtaining $p_1(y,p)$. These values have been averaged over the provinces to get a
year-dependent average, which is shown in the inset (the line is a guide for the eye). This plot
shows that the monotonicity of $n_H$ with respect to $\Gamma$ (i.e. $p_1 < 0$) is robust with
respect to the year; the same holds even when we average over the years, namely it is robust with
respect to the province.

Thus, at least for small percentages, that is $\Gamma \approx \gamma$, there is a positive
correlation between export portfolio diversification and the density of migrants in a particular
province. We can therefore derive that migrants act as facilitators of trade by reducing
international transaction costs.
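
The concentration measure used in this section is simple to compute from export values by product-destination pair. The sketch below is illustrative: the text does not state how the index is normalized, so the common choice $n_H = (N_H - 1/K)/(1 - 1/K)$ is assumed here; function names and numbers are hypothetical.

```python
def herfindahl(values):
    """Sum of squared export shares; equals 1/K for K equal shares and 1 for a single pair."""
    total = sum(values)
    if total <= 0:
        raise ValueError("total exports must be positive")
    return sum((v / total) ** 2 for v in values)


def normalized_herfindahl(values, n_pairs=None):
    """Rescale the index to [0, 1] as (N_H - 1/K) / (1 - 1/K) (assumed normalization).

    n_pairs is the number K of possible product-destination pairs; by default
    it is the number of values supplied.
    """
    k = len(values) if n_pairs is None else n_pairs
    if k < 2:
        raise ValueError("at least two product-destination pairs are needed")
    return (herfindahl(values) - 1.0 / k) / (1.0 - 1.0 / k)


if __name__ == "__main__":
    concentrated = [97.0, 1.0, 1.0, 1.0]   # one dominant product-destination pair
    diversified = [25.0, 25.0, 25.0, 25.0]  # exports evenly spread
    print("concentrated:", round(normalized_herfindahl(concentrated), 4))
    print("diversified: ", round(normalized_herfindahl(diversified), 4))
```

A fully diversified portfolio gives a normalized index of 0 and a portfolio concentrated in a single pair gives 1, so a decrease of the index with $\Gamma$ signals increasing diversification.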


B. PART TWO: Insights from Graph Theory 

The interaction between natives and immigrants was 
described in terms of a bipartite graph (see Fig. 3, left 
panel). The statistical-mechanics analysis shows that if 
the local agents $i$ and $j$ both interact with some 
foreign-born individual $\mu$, i.e. $\xi_i^{\mu}\xi_j^{\mu} \neq 0$, then the agents $i$ and 
$j$ can be thought of as directly interacting via an effective 
coupling $J_{ij}$ (see Fig. 3, right panel, and 
Eq. 5). We now focus on this emergent network, referred 
to as Q, and, through calibration with available data, we 
try to infer information for the test case of Spain. 


1. A glance at the theory 


The topological properties of Q have previously been 
investigated mathematically in [25, 29, 30], and here we 
review the main points. 

A global characterization of the graph Q can be attained 
in terms of the average link probability $p$: considering 
a generic couple of nodes, say $i$ and $j$, and keeping a 
mean-field perspective, we can write 

$$p = 1 - \prod_{\mu=1}^{N_2}\left[1 - P(\xi_i^{\mu}=1)\,P(\xi_j^{\mu}=1)\right] \qquad (13)$$

$$\phantom{p} = 1 - \left[1 - \frac{\alpha^{2}}{N^{2\theta}}\right]^{\gamma N}, \qquad (14)$$

where in Eq. (13) the term in square brackets represents 
the probability that the $\mu$-th contribution $\xi_i^{\mu}\xi_j^{\mu}$ in the 
sum defining $J_{ij}$ is equal to zero, and the product over $\mu$ returns 
the probability that all entries $\mu = 1, ..., N_2$ are null; 
the complement of this quantity then provides 
the probability that at least one entry is non-null, that 
is, that $J_{ij} > 0$. In Eq. (14) we used the homogeneity of the 
pattern entries $\xi_i^{\mu}$ and the definition of $\gamma$. 

The average degree of Q therefore reads as $d = pN_1$. 

Now, as $\theta$ and $\alpha$ are tuned, the emerging graph can 
range from fully connected to completely disconnected 
[29, 30]. From a mean-field perspective, we can distinguish 
the following topological regimes: 


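• $\theta < 1/2$: $p \sim 1$, $d \sim N$ $\Rightarrow$ Fully connected 
(weighted) graph. 

• $\theta = 1/2$: $p \sim 1 - e^{-\gamma\alpha^{2}}$, $d = O(N)$ $\Rightarrow$ Linearly 
extensive degree. 

• $1/2 < \theta < 1$: $p \sim \gamma\alpha^{2}N^{1-2\theta}$, $d = O(N^{2(1-\theta)})$ 
$\Rightarrow$ Extreme dilution regime: $\lim_{N\to\infty} d^{-1} = 
\lim_{N\to\infty} d/N = 0$. 

• $\theta = 1$: $p \sim \gamma\alpha^{2}/N$, $d = O(N^{0})$ $\Rightarrow$ Sparse 
(weighted) graph; $\alpha^{2}\gamma = 1$ corresponds to the 
percolation threshold. 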

Summarizing, large values of $\theta$ determine a disconnected 
graph with vanishing average degree. Therefore, $\theta$ 
coarsely controls the connectivity regime of the network, 
while $\alpha$ and $\gamma$ allow a finer tuning. 
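
The regimes listed above can be explored numerically. The sketch below is a minimal illustration that assumes each pattern entry is non-null with probability $\alpha/N^{\theta}$ and that $N_2 = \gamma N$, $N_1 = (1-\gamma)N$; the symbol `alpha` for the dilution parameter, the function names and the sample values are assumptions made for illustration.

```python
import math


def link_probability(n: int, gamma: float, alpha: float, theta: float) -> float:
    """Mean-field probability that two natives share at least one migrant.

    Assumes P(entry = 1) = alpha / n**theta and N_2 = gamma * n migrants.
    """
    q = (alpha / n ** theta) ** 2  # both natives know the same migrant
    if not 0.0 <= q < 1.0:
        raise ValueError("alpha / n**theta must lie in [0, 1)")
    # 1 - (1 - q)**(gamma * n), written in a numerically stable way
    return -math.expm1(gamma * n * math.log1p(-q))


def average_degree(n: int, gamma: float, alpha: float, theta: float) -> float:
    """Average degree d = p * N_1, with N_1 = (1 - gamma) * n natives."""
    return link_probability(n, gamma, alpha, theta) * (1.0 - gamma) * n


# Illustrative values only: one million individuals, 10% migrants, alpha = 3.
for theta in (0.25, 0.5, 0.75, 1.0):
    p = link_probability(10**6, 0.1, 3.0, theta)
    d = average_degree(10**6, 0.1, 3.0, theta)
    print(f"theta={theta}: p={p:.3e}, d={d:.3e}")
```

Under these assumptions the average degree stays finite only at $\theta = 1$, where it approaches $\gamma(1-\gamma)\alpha^{2}$ for large $N$.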

As the graph Q is meant to describe the mutual interactions 
among the decision makers inside a society, it is 
worth investigating whether it also exhibits any of the 
small-world hallmarks. Indeed, as shown in [25, 29, 30], 
this is the case: for instance, in the proper parameter 
range, Q is shown to display a small diameter and a high 
clustering coefficient. In fact, the definition in Eq. 5 
(i.e. the Hebbian kernel) implicitly endows couplings 
with “transitivity”: if $i$ and $j$ are connected because they share 
acquaintances among immigrants, and the same holds for 
$i$ and $z$, then $j$ and $z$ are also likely to share an 
acquaintance. Otherwise stated, interactions based on sharing 
(i.e., matching non-null entries) intrinsically generate a 
clustered society. 

Up to now we just focused on the bare topology, yet 
the graph Q is weighted and we can wonder whether, even 
from this perspective, the graph exhibits typical features 
of social networks. 

In particular, according to the strength-of-weak-ties 
theory by Granovetter [23, 24], the degree of overlap of 
two individuals’ neighbourhoods varies directly with the 
strength of their tie to one another. If the two individuals 
are acquaintances (rather than close friends), there is 
little overlap. Consistently, in the graph Q weak ties connect 
individuals sharing a small number (possibly only 
one) of connections in the immigrant community. 

Finally, as shown in [25, 35-39], weak ties also turn out 
to be crucial in order to keep the network connected: 
by cutting (a relatively small number of) weak links the 
network gets fragmented into several components. 


2. Inferring the topological properties 

The parameters in play are $\gamma$, $\theta$ and $\alpha$; their values 
determine the topology of the emergent network and 
are also expected to affect the growth of trades (see e.g. 
the self-consistency relation derived in Appendix A). Let us 
now try to estimate them starting from empirical data. 

As for $\theta$, we can derive it through an indirect measure: 
we expect that the number of links between locals and 
immigrants is lower bounded by the number of mixed 
marriages $M_{\mathrm{mixed}}$. In fact, a mixed marriage yields, in 
general, several “mixed acquaintances” between the family 
members and the friends of the two parties. In complete 
generality, the probability of mixed marriage $P_{\mathrm{mixed}}$ 
FIG. 12: Data on mixed marriages for each province over 
the years 1998-2012 are drawn from the local offices of Vital 
Records and Statistics (Registro Civil) and divided by the 
size $N_2$ of the related immigrant community. Raw data (blue 
bullets) are binned (green squares) to highlight the 
effective behaviour with respect to the overall size $N$ of the 
related province. The red line shows the lack of dependence 
of $M_{\mathrm{mixed}}/N_2$ on $N$ in the large-$N$ limit. 


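also scales with $N$, that is $P_{\mathrm{mixed}} \sim N^{-\tilde{\theta}}$ with $\tilde{\theta} \geq \theta$; 
therefore, we can write 

$$M_{\mathrm{mixed}} \sim N_1 \times N_2 \times P_{\mathrm{mixed}} \sim N_1 \times N_2 \times N^{-\tilde{\theta}}, \qquad (15)$$

from which 

$$\frac{M_{\mathrm{mixed}}}{N_2} \sim N^{1-\tilde{\theta}}. \qquad (16)$$

Mixed marriages in Spain have been thoroughly investigated, 
and from those data we can fit the ratio 
$M_{\mathrm{mixed}}/N_2 \sim N^{1-\tilde{\theta}}$, inferring an estimate for $\tilde{\theta}$. As shown 
in Fig. 12, the number of normalized mixed marriages is 
roughly constant with respect to $N$, that is $\tilde{\theta} \approx 1$. As a 
consequence, $\theta \leq 1$. 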

Now, a value of $\theta$ strictly smaller than 1 would imply 
that the number of connections between the two parties 
grows indefinitely with $N$ (or, analogously, with $N_1$ or 
$N_2$), and this is certainly not realistic (it would imply 
infinite energy in order to sustain such a network, and the 
linear extensivity of the related thermodynamics would 
break down). Thus, the experimental argument for the 
lower bound coupled with the theoretical argument for 
the upper bound implies $\theta = 1$ (as is intuitive). 
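
The scaling argument behind Eq. (16) amounts to a straight-line fit in log-log scale: if the number of mixed marriages per migrant scales as $N^{s}$ with the province size $N$, the decay exponent of the mixed-marriage probability is $1 - s$. The sketch below is a minimal illustration with ordinary least squares; the function names and the synthetic numbers are hypothetical and are not the Registro Civil data.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log(y) = s * log(x) + b; returns (s, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    if sxx == 0.0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def marriage_exponent(sizes: Sequence[float], ratio: Sequence[float]) -> float:
    """Exponent of the mixed-marriage probability, 1 - s, from M_mixed/N_2 ~ N**s."""
    slope, _ = fit_power_law(sizes, ratio)
    return 1.0 - slope


# Synthetic example: a ratio that does not depend on N gives an exponent of 1.
sizes = [1e5, 2e5, 5e5, 1e6]
print(marriage_exponent(sizes, [0.01, 0.01, 0.01, 0.01]))
```

In practice the fit would be run on the binned data rather than on the raw points, since the raw scatter across provinces is large.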

Finally, we need to estimate $\alpha$. According to the distribution 
of the pattern entries, and having fixed $\theta = 1$, $\alpha$ represents 
the average number of local acquaintances displayed by an immigrant. In our 
analysis we bound $\alpha$ between 1 (we expect that any immigrant 
has at least one link with the local community) 
and 20: there are several sociological studies trying to 
estimate the average number of acquaintances (relatives 
and/or friends) of a member of a society. In particular, 
in [40, 41] this analysis is performed for Spain, finding that 
this number is $O(10)$, similarly to other European 
countries. 




[Fig. 13, upper panel; marked provinces: Valencia, Barcelona, Madrid; horizontal-axis ticks: 0.05, 0.1, 0.15]




FIG. 13: Upper panel: Comparison between $c_{ER}$ and $c$, as a 
function of $\Gamma$ and $\alpha$. The empirical values of $\Gamma$ for the 
test-case provinces are also shown. Lower panel: Average size 
$\langle s \rangle$ of the giant component obtained by bond-percolating Q, 
$1 - f$ being the fraction of links deleted. Two processes are 
compared: random dilution (links to be deleted are extracted 
randomly) and deterministic dilution (links to be deleted are 
chosen starting from those with lower weight). Remarkably, in 
the latter case, by deleting the weakest links, corresponding to 
a small fraction $1 - f$ of the overall links, the graph already 
gets fragmented into several components. See [25] for more 
details. 


According to these estimates for $\theta$, $\alpha$ and $\gamma$ we expect 
a sparse graph, and we can check whether the emergent 
graph is indeed clustered. In Fig. 13 (upper panel) we 
show the ratio between the average clustering coefficient 
$c(\gamma, \alpha)$ measured in a numerical realization of Q and the 
clustering coefficient $c_{ER}$ of an analogous Erdős-Rényi 
graph. More precisely, $c(\gamma, \alpha)$ is measured as a function 
of $\alpha$ and $\gamma$, varied within the ranges empirically detected 
as described above; for each choice of parameters we 
derive an average degree $d$, which is used to estimate $c_{ER}$, 
namely $c_{ER} = d/N_1$. As long as $c/c_{ER} > 1$, Q is highly 
clustered, and this occurs in a wide region of the plane 
$(\Gamma, \alpha)$, especially in the region of high dilution: in the 
parameter range considered the graph Q turns out to be a 
small world. 
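
The comparison between $c$ and $c_{ER}$ can be sketched as follows. The code is a minimal illustration, not the simulation used for Fig. 13: it assumes that every migrant is independently acquainted with each native with the same probability `prob`, links two natives whenever they share at least one migrant, and estimates $c_{ER}$ as the measured average degree divided by $N_1$. Function names, sizes and the seed are arbitrary choices.

```python
import random
from itertools import combinations
from typing import List, Set


def hebbian_graph(n1: int, n2: int, prob: float, rng: random.Random) -> List[Set[int]]:
    """Adjacency sets among n1 natives; i and j are linked if they share a migrant."""
    adj: List[Set[int]] = [set() for _ in range(n1)]
    for _ in range(n2):
        # natives acquainted with this migrant (assumed i.i.d. entries)
        friends = [i for i in range(n1) if rng.random() < prob]
        for i, j in combinations(friends, 2):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_clustering(adj: List[Set[int]]) -> float:
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj:
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


rng = random.Random(0)
n1, n2 = 300, 100                        # natives and migrants (illustrative sizes)
graph = hebbian_graph(n1, n2, 0.01, rng)
d = sum(len(nbrs) for nbrs in graph) / n1
c = average_clustering(graph)
c_er = d / n1                            # Erdős-Rényi estimate at the same degree
print(f"d={d:.2f}, c={c:.3f}, c_ER={c_er:.4f}")
```

Because every migrant induces a clique among the natives it is acquainted with, triangles are generated far more often than in a random graph with the same average degree.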

Lastly, we address the problem of the existence of weak 
ties within the network generated by the Hebbian rule 
(Eq. 5). To this end, we have numerically built 
over-percolated networks of various sizes and then, for each 
sample, we performed two types of dilution: the former 
is purely random, namely we delete a fraction of links 
extracted according to a uniform distribution; the latter 
is deterministic, namely we delete links starting from those 
corresponding to the weakest couplings. We can then 
compare the results (shown in Fig. 13, lower panel). If 
weak ties effectively play a crucial role in keeping different 
communities connected, then the deterministic 
dilution should break the giant component first (i.e. 
at higher values of the network’s connectivity). This is indeed the 
case; hence, at least numerically, we confirm 
that these Hebbian networks are small worlds. 
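
The two dilution protocols can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of [25]: the weight of a link is taken to be the number of migrants shared by the two natives, ties between equal weights are broken arbitrarily, and the sizes, probabilities and seeds are arbitrary.

```python
import random
from collections import Counter
from itertools import combinations
from typing import Dict, List, Optional, Tuple

Link = Tuple[int, int]


def weighted_hebbian_links(n1: int, n2: int, prob: float, rng: random.Random) -> Dict[Link, int]:
    """Weight of link (i, j) = number of migrants shared by natives i and j."""
    weights: Counter = Counter()
    for _ in range(n2):
        friends = [i for i in range(n1) if rng.random() < prob]
        for pair in combinations(friends, 2):  # pairs come out with i < j
            weights[pair] += 1
    return dict(weights)


def giant_component(n: int, links: List[Link]) -> int:
    """Size of the largest connected component, via union-find."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    return max(Counter(find(i) for i in range(n)).values())


def dilute(weights: Dict[Link, int], keep: float, rng: Optional[random.Random] = None) -> List[Link]:
    """Keep a fraction `keep` of the links.

    With rng: random dilution. Without rng: deterministic dilution,
    i.e. the weakest links are the first to be deleted.
    """
    links = list(weights)
    if rng is None:
        links.sort(key=lambda e: weights[e], reverse=True)  # strongest first
    else:
        rng.shuffle(links)
    return links[: round(keep * len(links))]


w = weighted_hebbian_links(200, 80, 0.05, random.Random(1))
for f in (1.0, 0.8, 0.6, 0.4, 0.2):
    s_rand = giant_component(200, dilute(w, f, random.Random(2)))
    s_det = giant_component(200, dilute(w, f))
    print(f"f={f}: random {s_rand}, weakest-first {s_det}")
```

How clearly the two curves separate depends on the weight distribution: in a very sparse realization most links have weight 1, so the regime and the size have to be chosen as in the over-percolated samples described above.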


III. CONCLUSIONS AND OUTLOOKS 

The recent results of Hidalgo, Klinger, Barabasi and 
Hausmann, as well as those by Pietronero, Caldarelli, 
Gabrielli and their coworkers, mark a breakthrough 
in the modern theory of Economical Complexity: 
while classical economic theories prescribed specialization 
in the industrial production of the most developed 
countries, their investigations clearly show that nowadays 
the production of such countries is actually extremely 
diversified. However, in these papers, how diversification 
relates to international trade is not examined in depth, and this is 
the goal of the present work. 

Our approach is framed within Statistical Mechanics, 
a well-consolidated stochastic tool in Theoretical 
Physics that aims to detect emergent, collective 
behaviors that are shared across systems irrespective of their details, and it 
is supported by extensive data analysis for the test case 
of Spain. The resulting theory adds a new tile to 
the modern mosaic of Economical Complexity, shedding 
light on the way diversification of exports is achieved 
through the continuous interaction of natives and migrants. 
These exchanges of information are fundamental 
in allowing firm holders to reduce transactional 
costs, thus tacitly allowing a larger basin of firms to appear 
on the international market. 

From a practical economic perspective, our results 
suggest the existence of a (possibly very small) critical 
threshold $\Gamma_c$ in the percentage of migrants present in the 
host community before a boost in international trading 
is achieved, as well as a saturation effect, in agreement 
with the Chaney distorted-gravity scheme, with 
recent non-linear models by Egger et al. [9], and with 
the (related) pioneering suggestions of Gould [21]. 

It is worth highlighting that, through an analogy with 
phase transitions, we can quantitatively find the probability 
distribution of the critical threshold, which, when 
considering migrants from all over the world as a whole, 
is Poissonian-like with a peak at 0.3% of the 
whole population, while, when considering migrants from 
a specific country, decreases to values $\Gamma_c \sim 10^{-5}$, whose 
scaling is in agreement with the observation that trading 
nations are $\sim 10^{2}$. 

Summarizing, under the assumption of not-too-large 
migrant densities, the effect of immigrants’ networking 
on exports is always significant, robust and stable across 
goods. Indeed, we can safely state that migrants play 
a significant role in the modern theory of Economical 
Complexity. 

Further outlooks may cover more refined measures 
of product complexity, in order to better tackle the 
outlined (indirect) influence of migrants on the global market; 
other nations beyond Spain should also be considered, to 
give more ground to the theory as a whole. 
Appendix A: Statistical-mechanics analysis 


Let us introduce the set of order parameters 
$\{m_{\mu}(\sigma_p)\}_{\mu=1}^{N_2}$ (where the subscript $p$ refers to a given 
province $p$) as 

$$m_{\mu}(\sigma_p) = \frac{1}{C}\sum_{i=1}^{N_1}\xi_i^{\mu}\sigma_i, \qquad (A1)$$

where $C$ normalizes with respect to the expected number 
of non-null entries, namely 

$$C = N_1 P(\xi_i^{\mu} = 1) = N_1\,\alpha N^{-\theta} = \alpha(1-\gamma)N^{1-\theta}. \qquad (A2)$$

Therefore, $m_{\mu}(\sigma)$ is the average willingness to engage in international 
trading of the Spanish people living in the province $p$ who 
share the acquaintance of the migrant $\mu$. 

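Owing to the gauge-like symmetry of the model, clearly 
$\langle m_{\mu}(\sigma)\rangle = m$ for each $\mu = 1, \dots, N_2$. Further, as in the 
dilution regime of empirical interest each decision maker 
$\sigma_i$ is linked with (at least) one migrant $\mu$, we have that 
$\langle m(\sigma)\rangle = \langle m_{\mu}(\sigma)\rangle = m$, i.e. $m$ is the average 
predisposition of the whole host community towards international 
trading. This is because 

$$\langle m(\sigma)\rangle = \frac{1}{N_1}\sum_{i=1}^{N_1}\langle\sigma_i\rangle = \langle\sigma_i\rangle, \qquad \langle m_{\mu}(\sigma)\rangle = \frac{1}{C(N)}\sum_{i=1}^{N_1}\langle\xi_i^{\mu}\sigma_i\rangle = \frac{N_1}{C(N)}\,P(\xi_i^{\mu}=1)\,\langle\sigma_i\rangle = \langle\sigma_i\rangle. \qquad (A3)$$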


In terms of these new order parameters we can write 

(A4) 
that can be evaluated straightforwardly with a standard 
saddle-point argument: 

^ = fU^rnJ (A 5 ) 

/J, fl 

a 

fl ^ fJb 

— i rh 

. e^T.% log2+JViElogcosh( )_ 

As is well known, only the dominant term contributes to 
the free energy in the large-$N$ limit, hence 


where 


A({m^},{m/.}) 


= log 2 + E log cosh( 


-* E^!=1 

C(1 -7)Afi-® 


) 


+ 


2Ni 


N 2 


J2^l + 


i 


N 2 

y] m^m^. 

fl=l 


parameter as a solution of 

^ (w) ^^'"tanh(/?^(l• 

(A7) 

Evaluating explicitly the average over we get 


= b^tanh(/ 3 ^(l - 7)^A® ^im^,+ '^m^O)\ . 

\ ^ 

(A8) 

Now we can look for the solution $m_{\mu} = m$ for each $\mu$: 
neglecting the sub-leading term (which goes to zero in 
the thermodynamic limit) we find that $m$ obeys 

TO = (tanh(/ 32 ^(l - 'f) 7 ]m))^ , (A 9 ) 

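where we defined the random variable $\eta = N^{\theta-1}\sum_{\mu=1}^{N_2}\xi^{\mu}$. 

Evaluating the moments of $\eta$ is straightforward, as 

$$\langle\eta\rangle = N^{\theta-1}N_2\,\mathbb{E}[\xi^{\mu}] = \gamma\alpha;$$

$$\mathrm{Var}[\eta] = N^{2(\theta-1)}N_2\,\mathrm{Var}[\xi^{\mu}] = O\!\left(N^{2(\theta-1)}\,N\,N^{-\theta}\right) \xrightarrow{N\to\infty} 0.$$

This means that in the limit of infinite size the signal $\eta$ 
is deterministic, and thus we have 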

m = tanh(;d^^^7(l — 7)771). 


Taking the sup of A we get the self-consistent relations 
of the system 


dm^A{{m^}, {7h^}) = 0 -)> 

= - N, - 


-im^ = -7) 


i i 






E 




This formula relates the expected number of trading 
firms (and, similarly, the expected volume of international 
trades) to the fraction $\gamma$ of foreign-born people 
in the province considered. 

The full Hamiltonian also contains an intra-party 
interaction term encoded by $J$, which was not considered 
in this treatment. Accounting also for this term would 
simply imply an additional term $\beta J m$ in the argument 
of the hyperbolic tangent, namely 

777 = tanh(/dJ777 +/d^^^FTTi), (^10) 


that, once solved together, returns the value of the order 


where we wrote $\Gamma = \gamma(1-\gamma)$ for simplicity. 
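
Self-consistency relations of this mean-field type are easily solved by fixed-point iteration. The sketch below is a minimal illustration that assumes the generic form $m = \tanh[\beta(J + K\Gamma)m]$, where $K$ is a hypothetical symbol standing for the prefactor of the native-migrant term; the function names and numerical values are likewise illustrative only.

```python
import math


def solve_order_parameter(beta: float, j_intra: float, k_inter: float,
                          big_gamma: float, m0: float = 1.0,
                          tol: float = 1e-12, max_iter: int = 100_000) -> float:
    """Iterate m -> tanh(beta * (J + K * Gamma) * m) until convergence.

    k_inter (K) is a hypothetical stand-in for the inter-party prefactor.
    """
    a = beta * (j_intra + k_inter * big_gamma)
    m = m0
    for _ in range(max_iter):
        new_m = math.tanh(a * m)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m  # not converged within max_iter (e.g. very close to criticality)


def critical_gamma(beta: float, j_intra: float, k_inter: float) -> float:
    """Value of Gamma at which beta * (J + K * Gamma) = 1 (onset of m > 0)."""
    return (1.0 / beta - j_intra) / k_inter


# Illustrative values: beta = 1, no intra-party coupling, K = 10.
print(critical_gamma(1.0, 0.0, 10.0))
for g in (0.05, 0.2):
    print(g, solve_order_parameter(1.0, 0.0, 10.0, g))
```

Below the critical value of $\Gamma$ the iteration collapses onto $m = 0$, while above it a finite $m$ appears continuously, which is the phase-transition picture behind the critical threshold of migrants.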